Ozone: Integrating Structured and Semistructured Data
Abstract
Applications have an increasing need to manage semistructured data (such as XML) along with conventional structured data. We extend the structured object database model ODMG and its query language OQL with the ability to handle semistructured data based on the OEM model and Lorel language, and we implement our extensions in a system called Ozone. In our approach, structured data, such as typed objects or relations, may contain entry points to semistructured data, and vice versa. The unified representation and querying of such "hybrid" data is the main contribution of our work. We retain strong typing and access to all properties of structured portions of the data while still allowing untyped access to semistructured portions of the data. Ozone also enhances both ODMG/OQL and OEM/Lorel by virtue of their combination. For example, ordering in ODMG allows Ozone to provide ordering in OEM (enabling correct modeling of XML). Furthermore, untyped OEM semantics can optionally be applied to suspend type-checking on ODMG data, allowing Ozone to support semistructured-style navigation of structured data. Ozone is implemented as a wrapper on top of the ODMG-compliant O2 database system, and it fully supports our extensions to the ODMG model and OQL.
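The hybrid idea in the abstract can be illustrated with a minimal Python sketch. This is not Ozone's implementation, its API, or OQL/Lorel syntax; the names `OEMNode`, `Product`, `step`, `navigate`, and `atom` are hypothetical and exist only for this illustration. The sketch assumes that a typed object may hold a reference into an ordered, edge-labeled OEM-style graph, that an OEM edge may in turn point at a typed object, and that a missing label yields an empty result rather than an error.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, List, Tuple


@dataclass
class OEMNode:
    """Semistructured node (hypothetical): an atomic value and/or an
    ordered list of (label, child) edges."""
    value: Any = None
    edges: List[Tuple[str, Any]] = field(default_factory=list)

    def add(self, label: str, child: Any) -> "OEMNode":
        # Edges keep insertion order, mimicking ordered subobjects.
        self.edges.append((label, child))
        return self


@dataclass
class Product:
    """Structured, typed object (hypothetical); `reviews` is an entry
    point into semistructured data."""
    name: str
    price: float
    reviews: OEMNode


def step(obj: Any, label: str) -> List[Any]:
    """Follow one label from obj and return every match, in order."""
    if isinstance(obj, OEMNode):
        return [child for (lbl, child) in obj.edges if lbl == label]
    if is_dataclass(obj):
        # Untyped, name-based access to a typed object's attributes.
        return [getattr(obj, f.name) for f in fields(obj) if f.name == label]
    return []  # atomic values have no outgoing edges


def navigate(root: Any, path: str) -> List[Any]:
    """Evaluate a dotted path uniformly over structured and OEM data."""
    frontier = [root]
    for label in path.split("."):
        frontier = [t for obj in frontier for t in step(obj, label)]
    return frontier


def atom(x: Any) -> Any:
    """Unwrap an OEM leaf to its atomic value; pass other values through."""
    return x.value if isinstance(x, OEMNode) else x


# Semistructured reviews: the second review has no "text" edge.
reviews = OEMNode().add(
    "review", OEMNode().add("rating", OEMNode(5)).add("text", OEMNode("Great"))
)
reviews.add("review", OEMNode().add("rating", OEMNode(3)))

# A typed object pointing into OEM data, itself reachable from an OEM root.
widget = Product("Widget", 9.99, reviews)
catalog = OEMNode().add("product", widget)

ratings = [atom(x) for x in navigate(catalog, "product.reviews.review.rating")]
names = [atom(x) for x in navigate(catalog, "product.name")]
texts = [atom(x) for x in navigate(catalog, "product.reviews.review.text")]
missing = navigate(catalog, "product.color")
```

Here one path expression crosses from the OEM root into the typed `Product` and back into OEM data. Direct attribute access such as `widget.price` remains available for the structured part, while `navigate` treats both kinds of data the same way and tolerates absent labels.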
Similar resources
Integrating Diverse Information Management Systems: A Brief Survey
Most current information management systems can be classified into text retrieval systems, relational/object database systems, or semistructured/XML database systems. However, in practice, many applications' data sets involve a combination of free text, structured data, and semistructured data. Hence, integration of different types of information management systems has been, and continues to be,...
Towards a Comprehensive Methodological Framework for Semantic Integration of Heterogeneous Data Sources
Nowadays, data can be represented and stored in different formats, ranging from unstructured data, typical of file systems, to semistructured data, typical of Web sources, to highly structured data, typical of relational database systems. Therefore, the necessity arises to define new models and approaches for uniformly handling data sources having different formats and structures, and obt...
Extraction of Tag Tree Patterns with Contractible Variables from Irregular Semistructured Data
Information Extraction from semistructured data becomes more and more important. In order to extract meaningful or interesting contents from semistructured data, we need to extract common structured patterns from semistructured data. Many semistructured data have irregularities such as missing or erroneous data. A tag tree pattern is an edge labeled tree with ordered children which has tree str...
Miro Web: Integrating Multiple Data Sources through Semistructured Data Types
The MIROWeb Esprit project has developed a unique technology to integrate multiple data sources through an object-relational model with semistructured data types. It addresses the problem of integrating irregular Web sources and regular relational databases through a mediated architecture based on a hybrid model, supporting relational, object and semistructured features. The project data exchan...